Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments to increase the speed of acquisition often result in systems with a narrower spectral bandwidth, and hence a lower axial resolution. Traditionally, image-processing-based techniques have been used to reconstruct subsampled OCT data; more recently, deep-learning-based methods have been explored. In this study, we simulate reduced axial scan (A-scan) resolution by Gaussian windowing in the spectral domain and investigate a learning-based approach for reconstructing the lost image features. In anticipation of the reduced resolution that accompanies wide-field OCT systems, we build upon super-resolution techniques and reconstruct the lost features with a pixel-to-pixel approach using a modified super-resolution generative adversarial network (SRGAN) architecture, with the aim of better supporting clinicians' decision-making and improving patient outcomes.
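As a rough illustration of the degradation step described above, the sketch below applies a Gaussian window to a single A-scan interference spectrum before the inverse Fourier transform; the narrower the window, the broader the axial point-spread function and the lower the axial resolution. The function name, the `sigma_fraction` parameter, and the synthetic spectrum are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def simulate_low_axial_resolution(spectrum, sigma_fraction=0.25):
    """Narrow the effective spectral bandwidth of one A-scan by Gaussian
    windowing in the spectral (wavenumber) domain, then reconstruct the
    depth profile. A narrower spectrum gives a broader axial point-spread
    function, i.e. lower axial resolution."""
    n = spectrum.shape[0]
    k = np.arange(n) - n / 2.0                    # assume centred sampling
    sigma = sigma_fraction * n                    # illustrative window width
    window = np.exp(-0.5 * (k / sigma) ** 2)      # Gaussian spectral window
    a_scan_full = np.abs(np.fft.ifft(spectrum))           # original profile
    a_scan_low = np.abs(np.fft.ifft(spectrum * window))   # degraded profile
    return a_scan_full, a_scan_low

# usage on synthetic data
rng = np.random.default_rng(0)
spec = rng.standard_normal(1024) + 1j * rng.standard_normal(1024)
full, low = simulate_low_axial_resolution(spec, sigma_fraction=0.1)
```

The pairs (full, low) produced this way could then serve as ground-truth/input pairs for a pixel-to-pixel reconstruction network.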
Differentially private deep learning has recently witnessed advances in computational efficiency and privacy-utility trade-off. We explore whether further improvements along the two axes are possible and provide affirmative answers leveraging two instantiations of \emph{group-wise clipping}. To reduce the compute time overhead of private learning, we show that \emph{per-layer clipping}, where the gradient of each neural network layer is clipped separately, allows clipping to be performed in conjunction with backpropagation in differentially private optimization. This results in private learning that is as memory-efficient and almost as fast per training update as non-private learning for many workflows of interest. While per-layer clipping with constant thresholds tends to underperform standard flat clipping, per-layer clipping with adaptive thresholds matches or outperforms flat clipping under given training epoch constraints, hence attaining similar or better task performance within less wall time. To explore the limits of scaling (pretrained) models in differentially private deep learning, we privately fine-tune the 175 billion-parameter GPT-3. We bypass scaling challenges associated with clipping gradients that are distributed across multiple devices with \emph{per-device clipping} that clips the gradient of each model piece separately on its host device. Privately fine-tuning GPT-3 with per-device clipping achieves a task performance at $\epsilon=1$ better than what is attainable by non-privately fine-tuning the largest GPT-2 on a summarization task.
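A minimal PyTorch sketch of the per-layer clipping idea is given below, assuming per-example gradients are obtained by looping over the batch; the paper's efficiency gain instead comes from clipping each layer as soon as its gradient is produced during backpropagation (e.g. via hooks). The model, thresholds, and hyperparameters are placeholders, not the authors' code.

```python
import torch

def dp_step_per_layer(model, loss_fn, xs, ys, thresholds, noise_multiplier, lr=0.1):
    """One DP-SGD update with per-layer (here: per parameter tensor) clipping.
    Each example's gradient is clipped separately for every parameter tensor
    with its own threshold C_l before summation; Gaussian noise scaled by C_l
    is then added to each per-layer sum."""
    params = [p for p in model.parameters() if p.requires_grad]
    accum = [torch.zeros_like(p) for p in params]
    for x, y in zip(xs, ys):                       # per-example gradients
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        for acc, p, c in zip(accum, params, thresholds):
            g = p.grad.detach()
            factor = min(1.0, c / (g.norm().item() + 1e-12))  # clip this tensor only
            acc.add_(g, alpha=factor)
    # the speed advantage described above comes from doing the clipping inside
    # backpropagation, as each layer's gradient is produced, rather than afterwards
    with torch.no_grad():
        for p, acc, c in zip(params, accum, thresholds):
            noisy = acc + noise_multiplier * c * torch.randn_like(acc)
            p.add_(noisy, alpha=-lr / len(xs))
```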
With the rapid development of computer vision, oriented object detection has gradually gained attention. In this paper, a novel differentiable angle coder named phase-shifting coder (PSC) is proposed to accurately predict the orientation of objects, along with a dual-frequency version, PSCD. By mapping the rotational periodicity of different cycles into phases of different frequencies, we provide a unified framework for the various periodic fuzzy problems in oriented object detection. Within this framework, common problems in oriented object detection such as boundary discontinuity and the square-like problem are elegantly solved in a unified form. Visual analysis and experiments on three datasets demonstrate the effectiveness and potential of our approach. In scenarios requiring high-quality bounding boxes, the proposed methods are expected to give competitive performance. The code is publicly available at https://github.com/open-mmlab/mmrotate.
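The encode/decode idea behind a phase-shifting coder can be sketched as follows, for a single frequency only; the dual-frequency variant and the handling of box-symmetry periods are omitted, and the step count is an illustrative choice rather than the paper's setting.

```python
import numpy as np

N_STEPS = 3  # number of phase-shifted samples (must be >= 3)

def psc_encode(theta):
    """Encode an angle with period 2*pi as N phase-shifted cosine values."""
    shifts = 2.0 * np.pi * np.arange(N_STEPS) / N_STEPS
    return np.cos(theta + shifts)

def psc_decode(codes):
    """Recover the angle from the phase-shifted samples via a DFT-like sum,
    which is continuous and differentiable in the codes."""
    shifts = 2.0 * np.pi * np.arange(N_STEPS) / N_STEPS
    return np.arctan2(-np.sum(codes * np.sin(shifts)),
                      np.sum(codes * np.cos(shifts)))

theta = 1.2345
assert np.isclose(psc_decode(psc_encode(theta)), theta)
```

Because the angle is represented by several smooth periodic values rather than a single wrapped scalar, there is no abrupt jump in the regression target at the period boundary.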
Optical coherence tomography (OCT) is a non-invasive technique that captures cross-sectional regions of the retina at micrometre resolution. It has been widely used as an auxiliary imaging reference to detect eye-related pathologies and to predict the longitudinal progression of disease characteristics. Retinal layer segmentation is one of the crucial feature extraction techniques, as changes in retinal layer thickness and deformation of the retinal layers caused by the presence of fluid are highly correlated with prevalent eye diseases such as diabetic retinopathy (DR) and age-related macular degeneration (AMD). However, these images are acquired from different devices with different intensity distributions, or in other words, belong to different imaging domains. This paper proposes a segmentation-guided domain-adaptation method that adapts images from multiple devices into a single image domain for which a state-of-the-art pretrained segmentation model is available. It avoids the time consumed by manually labelling each upcoming new dataset and by retraining the existing network. The semantic consistency and global feature consistency of the network minimize the hallucination effects that many researchers have reported for the cycle-consistent generative adversarial network (CycleGAN) architecture.
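One plausible way to instantiate the semantic-consistency idea alongside the usual cycle-consistency term is sketched below. The generator and segmentation networks, the KL-divergence formulation, and the loss weights are assumptions for illustration, not the paper's exact losses.

```python
import torch
import torch.nn.functional as F

def adaptation_losses(G_t2s, G_s2t, seg_model, x_target,
                      lambda_cyc=10.0, lambda_sem=1.0):
    """Illustrative loss terms for mapping a target-device OCT B-scan into the
    source domain where a pretrained layer-segmentation model exists.
    G_t2s / G_s2t: target->source / source->target generators (placeholders).
    seg_model: frozen pretrained segmentation network."""
    fake_source = G_t2s(x_target)
    # cycle consistency: target -> source -> target should reproduce the input
    rec_target = G_s2t(fake_source)
    loss_cyc = F.l1_loss(rec_target, x_target)
    # semantic consistency: segmentation predictions on the translated image
    # should agree with those on the original image, discouraging the
    # generator from hallucinating or removing retinal structures
    with torch.no_grad():
        seg_ref = seg_model(x_target).softmax(dim=1)
    seg_fake = seg_model(fake_source).log_softmax(dim=1)
    loss_sem = F.kl_div(seg_fake, seg_ref, reduction="batchmean")
    return lambda_cyc * loss_cyc + lambda_sem * loss_sem
```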
In this paper, we introduce DA$^2$, the first large-scale dual-arm dexterity-aware dataset for generating optimal bimanual grasping pairs for arbitrary large objects. The dataset contains about nine million parallel-jaw grasps generated from more than 6,000 objects, each annotated with various grasp dexterity measures. In addition, we propose an end-to-end dual-arm grasp evaluation model trained on rendered scenes from this dataset. We use the evaluation model as a baseline to demonstrate the value of this novel and non-trivial dataset through online analysis and real-robot experiments. All data and related code will be open-sourced at https://sites.google.com/view/da2dataset.
Neural networks are known to produce overconfident predictions on input images, even when those images are out-of-distribution (OOD) samples. This limits the application of neural network models in real-world scenarios where OOD samples are present. Many existing methods identify OOD instances by exploiting various cues, such as finding irregular patterns in the feature space, the logit space, the gradient space, or the raw image space. In contrast, this paper proposes a simple test-time linear training (ETLT) method for OOD detection. Empirically, we find that the probability of an input image being out-of-distribution is surprisingly linearly correlated with the features extracted by the neural network. Specifically, many state-of-the-art OOD algorithms, although designed to measure reliability in different ways, actually produce OOD scores that are mostly linearly correlated with image features. Therefore, by simply learning a linear regression model at test time from paired image features and inferred OOD scores, we can make more precise OOD predictions for the test instances. We further propose an online variant of the method, which achieves promising performance and is more practical in real-world settings. Notably, with the maximum softmax probability as the base OOD detector, we improve FPR95 from $51.37\%$ to $12.30\%$ on the CIFAR-10 dataset. Extensive experiments on several benchmark datasets demonstrate the efficacy of ETLT for OOD detection.
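Read literally, the core step is an ordinary least-squares fit from test-time image features to the scores of a base OOD detector. The sketch below uses random stand-ins for both; the feature dimension, the choice of maximum softmax probability as the base score, and the function names are illustrative, and details such as calibration and the online update are omitted.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_test_time_linear_ood(features, base_scores):
    """Fit a linear map from penultimate-layer features to the scores of a
    base OOD detector, computed on the (unlabeled) test inputs themselves."""
    reg = LinearRegression()
    reg.fit(features, base_scores)
    return reg

# illustrative usage with random stand-ins for network features / MSP scores
rng = np.random.default_rng(0)
feats = rng.standard_normal((2000, 512))   # features of test images
msp = rng.uniform(0.1, 1.0, size=2000)     # max softmax probability per image
reg = fit_test_time_linear_ood(feats, msp)
refined_scores = reg.predict(feats)        # used to flag OOD instances
```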
Deep-learning-based image dehazing methods trained on synthetic datasets have achieved remarkable performance, but their performance drops sharply on real hazy images due to the domain shift. Although some domain adaptation (DA) dehazing methods have been proposed, they inevitably require access to the source dataset in order to reduce the gap between the synthetic source domain and the real target domain. To address these issues, we propose a novel source-free unsupervised domain adaptation (SFUDA) image dehazing paradigm, in which only a well-trained source model and an unlabeled dataset of real hazy target images are available. Specifically, we design a domain representation normalization (DRN) module that makes the representation of real hazy-domain features match that of the synthetic domain to bridge the gap. With our plug-and-play DRN module, unlabeled real hazy images can adapt existing well-trained source networks. In addition, unsupervised losses are applied to guide the learning of the DRN module, consisting of a frequency loss and a physical prior loss. The frequency loss provides structure and style constraints, while the prior loss explores the inherent statistical properties of haze-free images. Equipped with our DRN module and the unsupervised losses, existing source dehazing models are able to dehaze unlabeled real hazy images. Extensive experiments on multiple baselines demonstrate the effectiveness and superiority of our method, both visually and quantitatively.
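The abstract names a frequency loss and a physical prior loss but not their exact forms; the sketch below shows one common instantiation of each (FFT amplitude/phase matching, and a dark channel prior), purely as an assumption about what such losses could look like rather than the paper's definitions.

```python
import torch
import torch.nn.functional as F

def frequency_loss(pred, ref):
    """Compare two images in the Fourier domain: the amplitude spectrum loosely
    captures style/illumination, the phase spectrum captures structure.
    (Phase wrap-around is ignored for simplicity.)"""
    fp, fr = torch.fft.fft2(pred), torch.fft.fft2(ref)
    amp_loss = torch.mean(torch.abs(fp.abs() - fr.abs()))
    pha_loss = torch.mean(torch.abs(torch.angle(fp) - torch.angle(fr)))
    return amp_loss + pha_loss

def dark_channel_prior_loss(img, patch=15):
    """Dark channel prior: in a haze-free image, most local patches contain at
    least one pixel that is dark in some color channel, so the dark channel
    should be close to zero."""
    min_c = img.min(dim=1, keepdim=True).values                 # min over RGB
    dark = -F.max_pool2d(-min_c, patch, stride=1, padding=patch // 2)  # min over patch
    return dark.mean()
```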
Adversarial examples, which are usually generated for specific inputs with a specific model, are ubiquitous for neural networks. In this paper, we reveal a surprising property of adversarial noise: adversarial noise crafted by one-step gradient methods is linearly separable when equipped with the corresponding labels. We theoretically prove this property for a two-layer network with randomly initialized entries and for the neural tangent kernel setup in which the parameters stay close to initialization. The idea of the proof is to show that the label information can be efficiently propagated back to the input while preserving linear separability. Our theory and experimental evidence further show that a linear classifier trained on the adversarial noise of the training data can classify the adversarial noise of the test data well, indicating that adversarial noise in fact injects a distributional perturbation into the original data distribution. Moreover, we empirically demonstrate that adversarial noise may become less linearly separable when the above conditions are violated, although it remains much easier to classify than the original features.
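A small experiment matching this claim can be sketched as follows: craft one-step (FGSM-style) noise for the training and test sets, fit a linear classifier on the (noise, label) pairs from training data, and check its accuracy on the test-set noise. The model, the epsilon value, and the use of logistic regression as the linear classifier are illustrative choices.

```python
import torch
import torch.nn.functional as F
from sklearn.linear_model import LogisticRegression

def one_step_adversarial_noise(model, x, y, eps=8 / 255):
    """FGSM-style one-step noise: eps times the sign of the input gradient."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return eps * grad.sign()

def noise_linear_separability(model, x_train, y_train, x_test, y_test):
    """Train a linear classifier on (noise, label) pairs from the training set
    and evaluate it on the noise of the test set."""
    n_tr = one_step_adversarial_noise(model, x_train, y_train)
    n_te = one_step_adversarial_noise(model, x_test, y_test)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(n_tr.flatten(1).detach().numpy(), y_train.numpy())
    return clf.score(n_te.flatten(1).detach().numpy(), y_test.numpy())
```

A high returned accuracy would indicate that the noise itself, not the clean image content, carries linearly decodable label information.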
Differentially private stochastic gradient descent (DP-SGD) is the workhorse algorithm behind recent advances in private deep learning. It provides a single privacy guarantee for all data points in the dataset. We propose an efficient algorithm to compute privacy guarantees for individual examples when releasing models trained with DP-SGD. We use the algorithm to investigate individual privacy parameters across a number of datasets. We find that most examples enjoy stronger privacy guarantees than the worst case. We further find that an example's training loss and its privacy parameter are well correlated. This means that groups that are underserved in terms of model utility are simultaneously underserved in terms of privacy guarantees. For example, on CIFAR-10, the class with the lowest test accuracy has an average $\epsilon$ that is 26.3% higher than that of the class with the highest accuracy. We also run membership inference attacks to show that this reflects disparate empirical privacy risks.
Although weakly supervised semantic segmentation using only image-level labels (WSSS-IL) is potentially useful, its low performance and implementation complexity still limit its application. The main causes are the (a) non-detection and (b) false-detection phenomena: (a) the class activation maps refined by existing WSSS-IL methods still cover only partial regions of large-scale objects, and (b) for small-scale objects, over-activation causes the maps to deviate from the object edges. We propose RecurSeed, which alternately reduces non-detection and false detection through recursive iterations, thereby implicitly finding an optimal junction that minimizes both errors. We also propose a novel data augmentation (DA) method called EdgePredictMix, which further expresses object edges by exploiting the probability-difference information between adjacent pixels when combining segmentation results, thereby compensating for the shortcomings of applying existing DA methods to WSSS. We achieve new state-of-the-art performance on both the PASCAL VOC 2012 and MS COCO 2014 benchmarks (VOC val 74.4%, COCO val 46.4%). The code is available at https://github.com/ofrin/recurseed_and_edgepredictmix.